The Effect of Extreme Poverty on Womens Fertility Rates

Author

Ainsley Forster, Sophia Taylor

Published

March 7, 2025

Introduction

Show load-and-clean-data code
library(tidyverse)
library(readr)
library(gganimate)
library(gifski)
library(knitr)

data_children_per_woman <- read_csv("/Users/sophietaylor/Documents/STAT 331/children_per_woman_total_fertility.csv") |>
  pivot_longer(cols = "1800":"2100",
               names_to = "year",
               values_to = "children_per_woman")

data_extreme_poverty_rate <- read_csv("/Users/sophietaylor/Documents/STAT 331/gm_epov_rate.csv") |>
  pivot_longer(cols = "1800":"2100",
               names_to = "year",
               values_to = "extreme_poverty_rate")

clean_data <- data_children_per_woman |>
  full_join(data_extreme_poverty_rate, by = c("country", "year")) |>
  drop_na()

Data and Variables: This project examines the relationship between extreme poverty and fertility rates using Gapminder data: Children per Woman (Total Fertility Rate): The average number of children a woman has and Extreme Poverty Rate: The percentage of people living below the international poverty line.

Hypothesis: We hypothesize that higher poverty rates correlate with higher fertility rates due to economic necessity and limited access to contraception. Declining poverty is expected to lower fertility due to improved education, healthcare, and economic opportunities.

Data Cleaning Steps: 1. Reshaped Data: Converted from wide to long format using pivot_longer(). 2. Merged Datasets: Joined on country and year. 3. Handled Missing Values: Removed incomplete observations with drop_na().

Considerations: There is limited historical data. As well, other socioeconomic factors can also impact fertility but are not included.

Data Visualization

Show poverty-and-fertility-over-time code
poverty_and_fertility_over_time <- clean_data |>
  ggplot(mapping = aes(x = extreme_poverty_rate,
                       y = children_per_woman)
         ) +
  geom_point(aes(group = seq_along(year))) + 
  transition_states(year,
                    transition_length = 1,
                    state_length = 1
                    )+
  theme_light()+
  labs(title = "Extreme Poverty and Womens Fertility Over Time",
       subtitle = "Fertility Rate (children per woman)",
       x = "Extreme Poverty Rate (%)",
       y = "")+
  geom_text(aes(x = 85, y = 1.5, label = paste("Year:", year)), 
            size = 5)

animate(poverty_and_fertility_over_time)

The animated visualization illustrates how extreme poverty and fertility rates change over time. The animation reveals a general downward trend in both variables, indicating that as time progresses, fertility rates and poverty levels tend to decrease. This supports the hypothesis that economic development and improved access to resources contribute to lower fertility rates.

Show poverty-vs-fertility-rates code
clean_data2 <- clean_data |>
  group_by(country) |>
  summarise(mean_fertility = mean(children_per_woman),
            mean_poverty = mean(extreme_poverty_rate)
            )

poverty_vs_fertility_rates <- clean_data2 |>
  ggplot(mapping = aes(x = mean_poverty,
                       y = mean_fertility)
         ) +
  geom_point() +
  theme_light()+
  labs(title = "Extreme Poverty vs. Womens Fertility",
       subtitle = "Fertility Rate (children per woman)",
       x = "Extreme Poverty Rate (%)",
       y = "")

plot(poverty_vs_fertility_rates)

The scatterplot demonstrates a positive relationship between extreme poverty rates and fertility rates across countries. Countries with higher poverty rates tend to have higher fertility rates, whereas countries with lower poverty rates tend to have fewer children per woman. The spread of points suggests some variability, but the overall trend indicates that poverty is a significant influence on fertility.

Linear Regression

Show SLR code
slr_model = lm(mean_fertility ~ mean_poverty, data = clean_data2)

print(paste("Fertility_Rate =",
            round(coef(slr_model)[1], 3),
            "+",
            round(coef(slr_model)[2], 3),
            "* Poverty_Rate")
      )
[1] "Fertility_Rate = 2.981 + 0.036 * Poverty_Rate"

The estimated regression equation is: “Fertility_Rate = 2.981 + 0.036 * Poverty_Rate” where: 2.981 (intercept) represents the predicted fertility rate when the extreme poverty rate is zero. 0.036 (slope) represents the expected increase in fertility rate for a increase in extreme poverty rate. The intercept suggests that even in the absence of extreme poverty, fertility rates are still about 3 children. The slope coefficient indicates that as extreme poverty increases by 1 percentage, the fertility rate is expected to increase by 0.036 children per woman.

Model Fit

Code
var_in_response <- var(clean_data2$mean_fertility)
var_in_fitted <- var(fitted(slr_model))
var_in_residuals <- var(slr_model$residuals)

variation_table <- data.frame(
  "Variance Type" = c("Response", "Fitted", "Residuals"),
  "Variance Value" = c(var_in_response, var_in_fitted, var_in_residuals)
  )

print(variation_table)
  Variance.Type Variance.Value
1      Response      0.8557724
2        Fitted      0.3640176
3     Residuals      0.4917549

The proportion of variability (R^2) is calculated by 0.3640/0.8558 = 0.425. This suggests that about 42.5% of the variation in fertility rates can be explained by extreme poverty. The remaining variation is most likely influenced by other socioeconomic and cultural factors not included in this model.